Mission Possible: Deploying Government Linked Data (Pt2)

Sandro Hawke, (sandro@w3.org), W3C/MIT, @sandhawke
John L. Sheridan, @johnlsheridan
Gov 2.0 Expo, May 25-26, 2010, Washington DC
http://www.w3.org/2010/Talks/0525-rdf-vocabularies (wiki)

Part 2

Viewing Your Data as Triples

  1. Good URIs
  2. Properties (relationships and attributes)
  3. Competition and Overlap
  4. Classes
  5. Vocabularies

Vocabulary as Interface

How do programs communicate via triples?

Alice and Bob publish triples, Charlie's software tries to use their data.

It's all about the specific URIs, the vocabulary.

Essentially, when using RDF, the vocabulary is the syntax, the API.
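A minimal sketch of this idea, using plain Python tuples rather than an RDF library. The example.org and example.com URIs, the sample data, and the helper names_in are made up for illustration; only the foaf:name URI is a real vocabulary term.

```python
# Triples are (subject, property, value) tuples; URIs are plain strings.
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

# Alice and Bob publish independently (hypothetical data and URIs).
alice_data = [
    ("http://alice.example.org/id/officer/17", FOAF_NAME, "Pat Jones"),
]
bob_data = [
    ("http://bob.example.com/people#lee", FOAF_NAME, "Lee Smith"),
    # Bob also uses a private property that Charlie's software does not know.
    ("http://bob.example.com/people#lee",
     "http://bob.example.com/terms#badge", "4411"),
]

def names_in(triples):
    """Charlie's code: it recognises data only by the property URI."""
    return sorted(value for (_, prop, value) in triples if prop == FOAF_NAME)

print(names_in(alice_data + bob_data))
```

Charlie never needed to know Alice's or Bob's file formats or schemas. The shared property URI was the whole interface, and Bob's unknown property was simply ignored.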

Example: Crime Reports

@@@ example of integrating crime data from multiple agencies

Identify Your Subjects

What are the things your data is about?

the items, entities, objects, individuals, ...

Look at your UML models, database records, and web site.

Assign Good URIs

give them good, long-term URI names

Pick URIs that:

Such as:

See Designing URI Sets for the UK Public Sector

Properties

The middle part of a triple, and essential to understanding it

Also known as:

[Figure: triple.png]

@@@ maybe SQL example

Property As Question

Each triple is the answer to a question.
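As a rough sketch, the subject and property together pose the question, and the third part of the triple is the answer. The ex: names and the helper ask below are hypothetical, chosen only for illustration.

```python
# Hypothetical triples about a crime report, as (subject, property, value).
triples = [
    ("ex:report42", "ex:reportedBy", "ex:officer17"),
    ("ex:report42", "ex:dateReported", "2010-05-25"),
    ("ex:report43", "ex:reportedBy", "ex:officer9"),
]

def ask(triples, subject, prop):
    """Question: what is the <prop> of <subject>? Returns every answer found."""
    return [value for (s, p, value) in triples if s == subject and p == prop]

# "Who reported report 42?"
print(ask(triples, "ex:report42", "ex:reportedBy"))
```

A question with no matching triple returns an empty list: the data does not say, which is different from the answer being "nothing".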

Object and Data Properties

A Data Property:

An Object Property:
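In the usual OWL terminology, a data property has a literal value (a string, number, or date, often with an XML Schema datatype), while an object property points at another thing named by a URI. The sketch below assumes that reading. The example.org URIs and the helper kind_of_use are hypothetical, and the sketch classifies one use of a property rather than the property's declaration.

```python
from collections import namedtuple

URI = namedtuple("URI", "value")                    # names a thing
Literal = namedtuple("Literal", "value datatype")   # a data value

XSD = "http://www.w3.org/2001/XMLSchema#"

report = URI("http://example.org/id/report/42")     # hypothetical subject
triples = [
    (report, URI("http://example.org/def/reportedBy"),
     URI("http://example.org/id/officer/17")),
    (report, URI("http://example.org/def/dateReported"),
     Literal("2010-05-25", XSD + "date")),
    (report, URI("http://example.org/def/itemsStolen"),
     Literal("3", XSD + "integer")),
]

def kind_of_use(triple):
    """Classify one use of a property by looking at the triple's value."""
    _, _, value = triple
    return "object property" if isinstance(value, URI) else "data property"

for t in triples:
    print(t[1].value, "->", kind_of_use(t))
```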

Data Types

@@@ xml data types

Some Well-Known Properties

@@@ make these be links to real documentation

rdfs:label

rdfs:comment

owl:sameAs

foaf:name

dc:creator

Overlapping, Competing Vocabularies

two terms for the same thing

two terms for very similar things

owl:sameAs

owl:equivalentProperty

Subproperty

(compare Dublin Core refinements)

rdfs:subPropertyOf

geographically near, overlapping, contained-within
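A sketch of what a subproperty declaration lets software conclude. It assumes one plausible reading of the geographic example, namely that contained-within implies overlapping, which in turn implies near. The ex: names and the function apply_subproperties are hypothetical; the rdfs:subPropertyOf URI is the real one.

```python
SUBPROP = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"

def apply_subproperties(triples):
    """Return the triples plus everything implied by rdfs:subPropertyOf."""
    inferred = set(triples)
    changed = True
    while changed:  # repeat until no new triple appears
        changed = False
        supers = [(s, o) for (s, p, o) in inferred if p == SUBPROP]
        for (s, p, o) in list(inferred):
            for (sub, sup) in supers:
                if p == sub and (s, sup, o) not in inferred:
                    inferred.add((s, sup, o))
                    changed = True
    return inferred

data = {
    ("ex:containedWithin", SUBPROP, "ex:overlaps"),
    ("ex:overlaps", SUBPROP, "ex:near"),
    ("ex:park", "ex:containedWithin", "ex:city"),
}
result = apply_subproperties(data)
print(("ex:park", "ex:near", "ex:city") in result)
```

A publisher can then state only the most specific relationship, and a consumer who asks about the more general one still finds the data.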

Messy Overlap

Not all related properties are equivalent properties or subproperties

foaf:firstName, foaf:givenName, foaf:lastName, foaf:familyName, foaf:name

Compare: http://dbpedia.org/resource/Barack_Obama and http://www.cyc.com/2004/06/04/cyc#UnitedStatesPresident (though the latter is just a class)

Conversion Rules

        if { ?x foaf:firstName ?first;
                foaf:lastName ?last }
        then
           { ?x foaf:familyName ?last;
                foaf:givenName ?first;
                foaf:name func:string-join(?first, ?last, " ")   # separator goes last
           }

        if { ?x foaf:name ?name } and
           pred:contains(?name, " ")
        then
           # incorrect when a name part contains a space, e.g. Hillary Rodham Clinton
           { ?x foaf:firstName func:substring-before(?name, " ");
                foaf:lastName func:substring-after(?name, " ")
           }

See Rule Interchange Format
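For readers without a rule engine at hand, the two rules above can be approximated in plain Python. This is a sketch, not a RIF implementation: the function name convert_names is made up, it applies each rule once rather than to a fixed point, and it assumes at most one firstName and one lastName per subject.

```python
FOAF = "http://xmlns.com/foaf/0.1/"

def convert_names(triples):
    """Apply both conversion rules once; return original plus derived triples."""
    triples = set(triples)
    first = {s: o for (s, p, o) in triples if p == FOAF + "firstName"}
    last = {s: o for (s, p, o) in triples if p == FOAF + "lastName"}
    derived = set()

    # Rule 1: firstName + lastName => givenName, familyName, name
    for x in first.keys() & last.keys():
        derived.add((x, FOAF + "givenName", first[x]))
        derived.add((x, FOAF + "familyName", last[x]))
        derived.add((x, FOAF + "name", first[x] + " " + last[x]))

    # Rule 2: name containing a space => firstName, lastName
    # (splits at the first space, so it shares the rule's known weakness)
    for (x, p, name) in triples:
        if p == FOAF + "name" and " " in name:
            before, _, after = name.partition(" ")
            derived.add((x, FOAF + "firstName", before))
            derived.add((x, FOAF + "lastName", after))

    return triples | derived

out = convert_names({
    ("ex:a", FOAF + "firstName", "Ada"),
    ("ex:a", FOAF + "lastName", "Lovelace"),
    ("ex:h", FOAF + "name", "Hillary Rodham Clinton"),
})
```

The second input shows the caveat noted in the rule: the split yields firstName "Hillary" and lastName "Rodham Clinton", which may not be what the person intends.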

Applying Rules

Advice?

The world is full of competing standards. That's good, but painful.

Probably best to follow Postel's Law:

 Be conservative in what you do; be liberal in what you accept from others.

Research topic: automatic downloading of conversion rules

Classes and Subclasses

Sets of objects with something in common.

Instances / rdf:type

Subclass hierarchy

plant / large_plant / tree / mature_horse_chestnut / the one in my back yard (an instance)
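A sketch of how the hierarchy works in triples: an instance is linked to a class with rdf:type, and classes are linked upward with rdfs:subClassOf. The ex: names and the helper classes_of are hypothetical; the two property URIs are the real ones.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

triples = [
    ("ex:my_back_yard_tree", RDF_TYPE, "ex:mature_horse_chestnut"),
    ("ex:mature_horse_chestnut", SUBCLASS, "ex:tree"),
    ("ex:tree", SUBCLASS, "ex:large_plant"),
    ("ex:large_plant", SUBCLASS, "ex:plant"),
]

def classes_of(thing, triples):
    """All classes `thing` belongs to, following rdfs:subClassOf upward."""
    found = set()
    todo = [o for (s, p, o) in triples if s == thing and p == RDF_TYPE]
    while todo:
        cls = todo.pop()
        if cls not in found:
            found.add(cls)
            todo.extend(o for (s, p, o) in triples
                        if s == cls and p == SUBCLASS)
    return found

print(sorted(classes_of("ex:my_back_yard_tree", triples)))
```

Only one rdf:type statement was published, yet a query for every ex:plant can still find the tree.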

Domain and Range

domain: the class of things which might have this property.

range: the class of possible values for this property
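Under RDFS semantics, domain and range are used to infer types rather than to reject data. A sketch of that inference follows; the ex: names and the function infer_types are hypothetical, and the sketch assumes at most one domain and one range per property.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

def infer_types(triples):
    """Return the rdf:type triples implied by rdfs:domain and rdfs:range."""
    domains = {s: o for (s, p, o) in triples if p == RDFS + "domain"}
    ranges = {s: o for (s, p, o) in triples if p == RDFS + "range"}
    inferred = set()
    for (s, p, o) in triples:
        if p in domains:            # the subject falls in the domain
            inferred.add((s, RDF_TYPE, domains[p]))
        if p in ranges:             # the value falls in the range
            inferred.add((o, RDF_TYPE, ranges[p]))
    return inferred

data = [
    ("ex:reportedBy", RDFS + "domain", "ex:CrimeReport"),
    ("ex:reportedBy", RDFS + "range", "ex:Officer"),
    ("ex:report42", "ex:reportedBy", "ex:officer17"),
]
types = infer_types(data)
```

Here the single ex:reportedBy statement is enough to conclude that ex:report42 is an ex:CrimeReport and ex:officer17 is an ex:Officer.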

@@ table of domain and range for various example properties

OWL

Powerful way of declaring how properties, classes, and individuals relate to each other.

"Ontologies"

http://www.w3.org/TR/owl2-primer/

OWL can be conveyed in triples, but it also has some easier-to-read syntaxes. I suggest the Manchester Syntax when you don't need triples.

Inference

machines ("reasoners") can process these ontologies

given:

they will infer

This is great if you are querying for children and have only parent data.

Also helps find errors in data and modeling.
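The parent and child case can be sketched with an inverse-property declaration. This is an assumption about the kind of inference meant here, not a reconstruction of the original example. The ex: names and apply_inverses are hypothetical; owl:inverseOf is a real OWL term. A real reasoner does far more than this.

```python
INVERSE_OF = "http://www.w3.org/2002/07/owl#inverseOf"

def apply_inverses(triples):
    """Add the triples implied by owl:inverseOf declarations."""
    triples = set(triples)
    pairs = set()
    for (p, rel, q) in triples:
        if rel == INVERSE_OF:       # inverseOf works in both directions
            pairs.add((p, q))
            pairs.add((q, p))
    inferred = {(o, q, s)
                for (s, p, o) in triples
                for (p2, q) in pairs if p == p2}
    return triples | inferred

data = {
    ("ex:hasParent", INVERSE_OF, "ex:hasChild"),
    ("ex:bob", "ex:hasParent", "ex:alice"),
}
result = apply_inverses(data)
print(("ex:alice", "ex:hasChild", "ex:bob") in result)
```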

http://www.w3.org/2001/sw/wiki/Category:Reasoner

SKOS

A less formal way to document your URIs.

Everything is a Concept. General broader/narrower.

Good when you want to quickly reuse an existing controlled vocabulary.

http://www.w3.org/TR/2009/NOTE-skos-primer-20090818/

Finding Vocabularies

Falcons

Watson

SWSE

Sindice

sameas.org

http://www.w3.org/2001/sw/wiki/Category:Search_Engine

Browsing Vocabularies

use the HTML documentation

use an ontology viewer http://www.w3.org/2001/sw/wiki/Category:Visualizer

Creating Vocabularies

text editor

Protégé

TopBraid Composer

Neologism

http://www.w3.org/2001/sw/wiki/Category:Editor

Evolving Vocabularies

Decide (and tell people) which terms are stable.
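One existing convention for telling people, used by the FOAF vocabulary, is to attach a vs:term_status value such as "stable" or "testing" to each term. The sketch below assumes that convention. The example.org terms and the helper terms_with_status are hypothetical.

```python
TERM_STATUS = "http://www.w3.org/2003/06/sw-vocab-status/ns#term_status"

# Hypothetical vocabulary terms, each annotated with its status.
vocabulary = [
    ("http://example.org/def/reportedBy", TERM_STATUS, "stable"),
    ("http://example.org/def/itemsStolen", TERM_STATUS, "testing"),
    ("http://example.org/def/oldCategory", TERM_STATUS, "archaic"),
]

def terms_with_status(triples, status):
    """List the terms whose declared status matches `status`."""
    return sorted(s for (s, p, o) in triples
                  if p == TERM_STATUS and o == status)

print(terms_with_status(vocabulary, "stable"))
```

Because the status is itself published as triples, consumers can check it in software before they depend on a term.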

Good Modeling

  1. Which items are you communicating about?
  2. What are the logical groups (classes) of those items?
  3. What properties can each kind of item have?